YAFIMA: Yet Another Frequent Itemset Mining Algorithm
Authors
Abstract
Efficient discovery of frequent patterns in large databases is an active research area in data mining, with broad applications in industry and deep implications for many areas of data mining. Although many efficient frequent-pattern mining techniques have been developed in the last decade, most of them assume relatively small databases, leaving extremely large but realistic datasets out of reach. A practical and appealing direction is to mine closed or maximal itemsets. These are subsets of the full set of frequent patterns, but they are good representatives because they eliminate what are known as redundant patterns. Their practicality comes from the fact that mining them is relatively inexpensive compared with finding all patterns. In this paper we introduce a new approach for traversing the search space that efficiently discovers all frequent patterns, the closed patterns, or the maximal patterns in extremely large datasets. We present experimental results for all three types of patterns on very large database sizes never reported before. Our implementation, tested on real and synthetic data, shows that our approach outperforms similar state-of-the-art algorithms by at least one order of magnitude in both execution time and memory usage, in particular when dealing with very large databases.
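The relationship between the three pattern types named in the abstract can be illustrated with a minimal Python sketch. This is a brute-force enumeration over a toy dataset, not the traversal strategy proposed in the paper; the function names, the toy transactions, and the absolute support threshold are assumptions made only for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support.

    Brute force: enumerates every candidate subset, so it is only usable on toy data.
    """
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            # Support = number of transactions that contain the whole itemset.
            support = sum(1 for t in transactions if itemset <= t)
            if support >= min_support:
                result[itemset] = support
    return result


def closed_itemsets(frequent):
    """Keep frequent itemsets that have no proper superset with the same support."""
    return {
        s: sup
        for s, sup in frequent.items()
        if not any(s < other and sup == other_sup
                   for other, other_sup in frequent.items())
    }


def maximal_itemsets(frequent):
    """Keep frequent itemsets that have no frequent proper superset at all."""
    return {
        s: sup
        for s, sup in frequent.items()
        if not any(s < other for other in frequent)
    }


# Hypothetical toy database of five transactions.
transactions = [
    frozenset("abc"),
    frozenset("abc"),
    frozenset("ab"),
    frozenset("a"),
    frozenset("bd"),
]

frequent = frequent_itemsets(transactions, min_support=2)
closed = closed_itemsets(frequent)
maximal = maximal_itemsets(frequent)

print(len(frequent), len(closed), len(maximal))  # prints: 7 4 1
```

On this toy input, every maximal itemset is also closed and every closed itemset is frequent, which is why the closed and maximal sets can serve as compact representatives of the full result.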
Similar resources
A New Algorithm for High Average-utility Itemset Mining
High utility itemset mining (HUIM) is an emerging field in data mining that has attracted growing interest due to its various applications. The goal is to discover all itemsets whose utility exceeds a minimum threshold. The basic HUIM problem does not consider the length of itemsets in its utility measurement, and utility values tend to become higher for itemsets containing more items...
Accelerating Closed Frequent Itemset Mining by Elimination of Null Transactions
The mining of frequent itemsets is often challenged by the length of the patterns mined and also by the number of transactions considered for the mining process. Another acute challenge that concerns the performance of any association rule mining algorithm is the presence of 'null' transactions. This work proposes a closed frequent itemset mining algorithm viz., Closed Frequent Itemset Mining a...
Research on Classification Mining Method of Frequent Itemset
The purpose of association mining is to find valuable relationships between data sets. Its prerequisite is to find the frequent itemsets first. In view of the existing problems in current frequent itemset mining, this paper proposes that data sets be clustered first and that the frequent itemset mining algorithm then be applied to every cluster. In this way, algorithm of ...
AMKIS: An Algorithm for Association Mining
Mining frequent items and itemsets is a daunting task in large databases and has attracted research attention in recent years. Generating a specific itemset, the K-itemset having K items, is an interesting research problem in data mining and knowledge discovery. In this paper, we propose an algorithm, named AMKIS, for generating frequent K-itemset patterns in large databases. AMKIS ...
Accelerating Parallel Frequent Itemset Mining on Graphics Processors with Sorting
Frequent Itemset Mining (FIM) is one of the most investigated fields of data mining. The goal of FIM is to find the most frequently occurring subsets of the transactions in a database. Many methods have been proposed to solve this problem, and the Apriori algorithm is one of the best-known methods for FIM in a transactional database. In ...
Journal title:
- JDIM
Volume 3, issue
Pages -
Publication date: 2005